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Abstract 

State-dependent  Riccati  equation  (SDRE)  techniques  are  rapidly  emerging  as  general  design 
and  synthesis  methods  of  nonlinear  feedback  controllers  and  estimators  for  a  broad  class  of 
nonlinear  regulator  problems.  In  essence,  the  SDRE  approach  involves  mimicking  standard 
linear  quadratic  regulator  (LQR)  formulation  for  linear  systems.  In  particular,  the  technique 
consists  of  using  direct  parameterization  to  bring  the  nonlinear  system  to  a  linear  structure 
having  state-dependent  coefficient  matrices.  Theoretical  advances  have  been  made  regarding 
the  nonlinear  regulator  problem  and  the  asymptotic  stability  properties  of  the  system  with 
full  state  feedback.  However,  there  have  not  been  any  attempts  at  the  theory  regarding  the 
asymptotic  convergence  of  the  estimator  and  the  compensated  system.  This  paper  addresses 
these  two  issues  as  well  as  discussing  numerical  methods  for  approximating  the  solution  to  the 
SDRE.  The  Taylor  series  numerical  methods  works  only  for  a  certain  class  of  systems,  namely 
with  constant  control  coefficient  matrices,  and  only  in  small  regions.  The  interpolation  numerical 
method  can  be  applied  globally  to  a  much  larger  class  of  systems.  Examples  will  be  provided 
to  illustrate  the  effectiveness  and  potential  of  the  SDRE  technique  for  the  design  of  nonlinear 
compensator-based  feedback  controllers. 

1  Introduction 

Linear  quadratic  regulation  (LQR)  is  a  well  established,  accepted,  and  effective  theory  for  the 
synthesis  of  control  laws  for  linear  systems.  However,  most  mathematical  models  for  biological 
systems,  including  HIV  dynamics  with  immune  response  [4,  17],  as  well  as  those  for  physical 
processes,  such  as  those  arising  in  the  microelectronic  industry  [3]  and  satellite  dynamics  [22],  are 
inherently  nonlinear.  A  number  of  methodologies  exist  for  the  control  design  and  synthesis  of  these 
highly  nonlinear  systems.  These  techniques  include  a  large  number  of  linear  design  methodologies 
[33,  15]  such  as  Jacobian  linearization  and  feedback  linearization  used  in  conjunction  with  gain 
scheduling  [25].  Nonlinear  design  techniques  have  also  been  proposed  including  dynamic  inversion 
[27],  recursive  backstepping  [18],  sliding  mode  control  [27],  and  adaptive  control  [18].  In  addition, 
other  nonlinear  controller  designs  such  as  methods  based  on  estimating  the  solution  of  the  Hamilton- 
Jacobi-Bellman  (HJB)  equation  can  be  found  in  a  comprehensive  review  article  [5].  Each  of  these 
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techniques  has  its  set  of  tuning  rules  that  allow  the  modeler  and  designer  to  make  trade-offs  between 
control  effort  and  output  error.  Other  issues  such  as  stability  and  robustness  with  respect  to 
parameter  uncertainties  and  system  disturbances  are  also  features  that  differ  depending  on  the 
control  methodology  considered. 

One  of  the  highly  promising  and  rapidly  emerging  methodologies  for  designing  nonlinear  con¬ 
trollers  is  the  state-dependent  Riccati  equation  (SDRE)  approach  in  the  context  of  the  nonlinear 
regulator  problem.  This  method,  which  is  also  referred  to  as  the  Frozen  Riccati  Equation  (FRE) 
method  [11],  has  received  considerable  attention  in  recent  years  [9,  10,  12,  26].  In  essence,  the 
SDRE  method  is  a  systematic  way  of  designing  nonlinear  feedback  controllers  that  approximate 
the  solution  of  the  infinite  time  horizon  optimal  control  problem  and  can  be  implemented  in  real¬ 
time  for  a  broad  class  of  applications.  Through  extensive  numerical  simulations,  the  SDRE  method 
has  demonstrated  its  effectiveness  as  a  technique  for,  among  others,  controlling  an  artificial  hu¬ 
man  pancreas  [23],  for  the  regulation  of  the  growth  of  thin  films  in  a  high-pressure  chemical  vapor 
deposition  reactor  [3,  2,  30],  and  for  the  position  and  attitude  control  of  a  spacecraft  [28].  More 
specifically,  recent  articles  [3,  2]  have  reported  on  the  successful  use  of  SDRE  in  the  development 
of  nonlinear  feedback  control  methods  for  real-time  implementation  on  a  chemical  vapor  deposition 
reactor.  The  problems  are  optimal  tracking  problems  (for  regulation  of  the  growth  of  thin  films  in 
a  high-pressure  chemical  vapor  deposition  (HPCVD)  reactor)  that  employ  state-dependent  Riccati 
gains  with  nonlinear  observations  and  the  resulting  dual  state-dependent  Riccati  equations  for  the 
compensator  gains. 

Even  though  these  computational  efforts  are  very  promising,  the  present  investigation  opens 
a  host  of  fundamental  mathematical  questions  that  should  provide  a  rich  source  of  theoretical 
challenges.  In  particular,  much  of  the  focus  thus  far  has  been  on  the  full  state  feedback  theory,  the 
implementation  of  the  method,  and  numerical  methods  for  solving  the  SDRE  with  a  constant  control 
coefficient  matrix.  In  most  cases,  the  theory  developed  also  involves  using  nonlinear  weighting 
coefficients  for  the  state  and  control  in  the  cost  functional  to  produce  near  optimal  solutions.  This 
methodology  is  quite  useful  and  also  quite  difficult  to  implement  for  complex  systems.  Therefore, 
it  is  of  general  interest  to  explore  the  use  of  constant  weighting  matrices  to  produce  a  suboptimal 
control  law  that  has  the  advantage  of  ease  of  configuration  and  implementation.  In  addition,  the 
development  of  a  comprehensive  theory  is  needed  for  approximation  and  convergence  of  the  state- 
dependent  Riccati  equation  approach  for  nonlinear  compensation.  Finally,  a  current  approach 
for  solving  the  SDRE  is  via  symbolic  software  package  such  as  Macsyma  or  Mathematica  [9]. 
However,  this  is  only  possible  for  systems  having  special  structures.  In  [6] ,  an  efficient  computational 
methodology  was  proposed  that  requires  splitting  the  state-dependent  coefficient  matrix  A(.x)  into 
a  constant  matrix  part  and  a  state-dependent  part  as  A(x)  =  Aq  +  eA A(x).  This  method  is 
effective  locally  for  systems  with  constant  control  coefficients  and  if  the  function  Ad(x)  is  not  too 
complicated  (e.g.,  when  it  has  the  same  function  of  x  in  all  entries)  then  the  SDRE  can  be  solved 
through  a  series  of  constant-valued  matrix  Lyapunov  equations.  The  assumption  on  the  form  of 
Ad(s),  however,  does  limit  the  problems  for  which  this  SDRE  approximation  method  is  applicable. 
Another  method,  based  on  interpolation,  is  effective  for  nonconstant  control  coefficients  and  it  can 
be  applied  throughout  the  state  space.  The  interpolation  approach  involves  varying  the  SDRE  over 
the  states  and  creating  a  grid  of  values  for  the  control  u(x)  or  the  solution  to  the  SDRE  n(x). 

In  this  paper,  we  examine  the  SDRE  technique  with  constant  weighting  coefficients.  In  Sec¬ 
tion  2,  we  review  the  full  state  feedback  theory  and  prove  local  asymptotic  stability  for  the  closed 
loop  system.  A  simple  example  with  an  attainable  exact  solution  is  presented  to  verify  the  the¬ 
oretical  result.  This  example  also  exhibits  the  efficiency  of  the  method  outside  of  the  region  for 
which  the  condition  in  the  proof  predicts  asymptotic  stability.  Section  3  summarizes  two  of  the 
numerical  methods  that  are  currently  available  for  the  approximation  of  the  SDRE  solution.  Sec- 
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tion  4  presents  the  extension  of  the  SDRE  methodology  to  the  nonlinear  state  estimation  problem. 
It  includes  local  convergence  results  of  the  nonlinear  state  estimator  and  a  numerical  example. 
Section  5  addresses  the  estimator  based  feedback  control  synthesis  including  a  local  asymptotic 
stability  result  for  the  closed  loop  system  as  well  as  an  illustrative  example. 


2  Full  State  Response 

In  this  section  we  formulate  the  optimal  control  problem  where  it  is  assumed  that  all  state  variables 
are  available  for  feedback.  In  [9],  the  theory  for  full  state  feedback  is  developed  for  nonconstant 
weighting  matrices.  Here,  we  formulate  our  optimal  control  problem  with  constant  weighting 
matrices.  In  particular,  the  cost  functional  is  given  by  the  integral 

J(xq,u)  =  -  /  xT Qx  +  uT Rudt,  (1) 

2  Jt0 

where  x  G  Kn,  u  G  Q  G  Hnxn  is  symmetric  positive  semidefinite  (SPSD),  and  R  G  is 

symmetric  positive  definite  (SPD).  Associated  with  the  performance  index  (1)  are  the  nonlinear 
dynamics 

x  =  f(x)  +  B(x)u,  (2) 

where  f(x)  is  a  nonlinear  function  in  Ck  and  B(x)  G  $Pnxm  is  a  matrix- valued  function.  Rewriting 
the  nonlinear  dynamics  (2)  in  the  state-dependent  coefficient  (SDC)  form  f(x)  =  A(x)x,  we  have 

x  =  A(x)x  +  B(x)u,  (3) 


where,  in  general,  A(x)  is  unique  only  if  x  is  scalar  [9].  For  the  multivariable  case  we  consider  an 
illustrative  example,  f(x)  =  [x2,x^\T .  The  obvious  SDC  parameterization  is 


Ai(x) 


0  1 
x\  0 


However,  we  can  find  another  SDC  parameterization 


A2(x) 


x2/x\  0 

x\  0 


by  dividing  and  multiplying  each  component  of  f(x)  by  x\.  We  find  yet  another  SDC  parameteri¬ 
zation 


A3(x) 


1  +  Xl 
0 


by  adding  and  subtracting  the  term  x\x2  to  fi(x).  Since  there  exists  at  least  two  SDC  parameter- 
izations,  there  are  an  infinite  number.  This  is  true  since  for  all  0  <  a  <  1, 


aA\(x)x  +  (1  —  a)A2(yx)x  =  af(x)  +  (1  —  a)f(x)  =  f(x). 


(4) 


Remark  2.1.  Choosing  the  SDRE  parameterization 


Because  of  the  many  available  SDC  parameterizations,  as  we  design  the  control  law  we 
must  choose  the  one  that  is  most  appropriate  for  the  system  and  control  objectives  of 
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interest.  One  factor  that  is  of  considerable  importance  is  the  state- dependent  controlla¬ 
bility  matrix  (or  in  estimation  theory,  the  state- dependent  observability  matrix).  As  in 
the  linear  theory,  the  matrix  is  given  by 

M{x)=[B(x)  A(x)B{x)  ...  A^-1\x)B{x)\  . 

In  linear  system  theory,  if  M  (a  constant  in  this  case)  has  full  rank  then  the  system  is 
controllable.  In  the  nonlinear  case  we  must  seek  a  parameterization  that  gives  M(x)  full 
rank  for  the  entire  domain  for  which  the  system  is  to  be  controlled.  For  estimation  (to 
be  considered  in  Section  4),  one  must  consider  the  state- dependent  observability  matrix, 


0{x) 


C(x) 

C(x)A(x) 

C(x)An~1(x) 


when  choosing  the  parameterization. 

The  Hamiltonian  for  the  optimal  control  problem  (l)-(3)  is  given  by 

H(x,  u,  A)  =  ~(xTQx  +  uT Ru)  +  XT (A(x)x  +  B(x)u). 

From  the  Hamiltonian,  the  necessary  conditions  for  the  optimal  control  are  found  to  be 


x  =  -Qx-\?PEM 

L  OX 

x  =  A(x)x  +  B(x)u, 


X- 


d  ( B(x)u ) 
dx 


A, 


and 


(5) 

(6) 
(7) 


0  =  Ru  +  BT  {x)X. 


(8) 


Let  Aj:  denote  the  ith  row  of  A{x)  and  denote  the  ith  row  of  B(x).  Then 


d  (A(x)x) 
dx 


d(A(x)) 

A{x)  +  V  "x 


A(x)  + 


dx 

dx\  J 

dAn:  r 
■  dx\  " 


9A1: 
dXn  ' 


dAn[ 

dxn 


X 


(9) 


and 


d  ( B(x)u ) 
dx 


■ft”  ■■ 

dBu 

dxn 

dBn, 

ldXlU  ’  ' 

dBn, 

dxn 

u 


(10) 


Mimicking  the  Sweep  Method  [19],  the  co-state  is  assumed  to  be  of  the  form  A  =  H{x)x  (note 
the  state  dependency).  Using  this  form  for  A  in  equation  (8)  we  obtain  a  feedback  control  u  = 
— R~1Bt(x)TI(x)x.  Substituting  this  control  back  into  (7),  we  find  x  =  A(x)x—B(x)R^1BT(x)Il(x)x 
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To  find  the  matrix- valued  function  II(x),  we  differentiate  A  =  II(x)x  with  respect  to  time  along  a 
trajectory  to  obtain 

a  =  n(x)  x  +  II(x)x  ,  . 

=  II(x)x  +  Tl(x)A(x)x  —  U(x)B(x)R~1Bt(x)U(x)x, 


where  we  use  the  notation 


n(x)  =  Xi(x)xi(t). 


i=  1 


If  we  set  this  equal  to  A  from  (6),  we  find 

II(x)x  +  II  (x)A(x)x  —  U(x)B(x)R~1Bt(x)U(x)x 


=  -Qx  - 

Rearranging  terms  we  find 


d(A(x)) 

A(x)  +  V  x 
ox 


U(x)x  — 


d  ( B(x)u ) 
Ox 


II(x)x. 


ri(x)  + 


d  (^4(x)) 
dx 


lT 


n(x)  + 


d  ( B(x)u ) 
dx 


n(x) 


+  (n(x)A(x)  +  AT{x)U{x)  -  U(x)B(x)R~1Bt{x)U{x)  +  Q) 


x  =  0. 


If  we  assume  that  II(x)  solves  the  SDRE,  which  is  given  by 

n(x)A(x)  +  ^T(x)n(x)  -  n(x)R(x)R~1RT(x)n(x)  +  q  =  o, 
then  the  following  condition  must  be  satisfied  for  optimality: 


II(x)  + 


d  (A(x)) 
dx 


n(x)  + 


d  ( B(x)u ) 
dx 


n(x)  =  o. 


(12) 


(13) 


To  be  consistent  with  [9],  we  will  refer  to  (13)  as  the  Optimality  Criterion.  The  suboptimal  control 
for  (1)  and  (3)  is  found  by  solving  (12)  for  II(x).  Then,  the  optimal  control  problem  has  the 
suboptimal  solution 

u  =  —K(x)x  where  K(x)  =  R^1  BT  (x)U(x) .  (14) 

In  [9],  a  methodology  is  presented  for  forming  a  state-dependent  weighting  matrix  Q(x)  so  as 
to  find  an  optimal  control  solution.  This  methodology  is  useful  but  somewhat  difficult  for  complex 
systems.  Therefore,  we  focus  on  the  suboptimal  control  law  that  is  useful  for  all  systems  and  has 
the  benefit  of  ease  of  implementation.  The  focus  for  the  remainder  of  the  section  will  be  the  local 
asymptotic  convergence  of  the  state  using  the  SDRE  solution  to  formulate  the  suboptimal  control. 
Some  of  the  nomenclature  associated  with  the  SDC  parameterization  is  now  given  to  aid  in  the 
presentation  of  our  theoretical  results. 

Definition  2.1  (Detectable  Parameterization).  ^4(x)  is  a  detectable  parameterization  of  the 
nonlinear  system  in  Ll  if  the  pair  ( Q ,  A(x))  is  detectable  for  all  x  G  D. 

Definition  2.2  (Stabilizable  Parameterization).  A{x)  is  a  stabilizable  parameterization  of  the 
nonlinear  system  in  D  if  the  pair  (A(x),  B(x))  is  stabilizable  for  all  x  €  LI. 
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We  now  present  a  lemma  relating  the  linearization  of  the  original  system  about  the  origin  and 
the  SDC  parameterization  evaluated  at  zero.  To  motivate  the  lemma,  consider  the  following  simple 
example: 

X1X2  sin(x2)  +  x2  cos(xi)] 


X  =  f{x )  = 

Two  obvious  choices  for  SDC  parameterizations  of  this  system  are 

Ai(x)  = 

and 


x{  +  2X2 


x2sin(x2)  cos(xi) 

xi  2 


A2  (x)  = 

The  gradient  of  /(x)  is  given  by 
V/(x)  = 


0  x\  sin(x2)  +  cos(xi) 
xi  2 


x2  sin(x2)  —  x2  sin(xi)  xi  sin(x2)  +  xix2  cos(x2)  +  cos(xi) 
2xi  2 


Since  the  linearization  of  /  at  the  origin  is  the  gradient  evaluated  at  zero,  we  obtain 

^i(O)  =  A2  (0)  =  V/(0)  =  [°  2 


Generalizing  this  result,  we  have  that: 

Lemma  2.1.  For  any  SDC  parameterization  A{x)x,  A(0)  is  the  linearization  of  /(x)  about  the 
zero  equilibrium. 

Proof.  Let  Ai(x)  and  ^42(x)  be  two  distinct  parameterizations  of  /(x)  and  let  ^4(x)  =  A\{x)  —  A2{x). 
Then  ^4(x)x  =  0  for  all  x  and 


dA{x)x 

dx 


dA(x) 

A(x)  H - - — -x  =  0. 

ox 


Then,  because  the  second  term  on  the  right  is  zero  at  x  =  0,  it  follows  that  ^4(0)  =  0  which  implies 
^4i(0)  =  A2(0).  Therefore  we  have  that  the  parameterization  evaluated  at  zero  is  unique.  Without 
loss  of  generality,  consider  the  parameterization  given  by  yli(x).  The  linearized  system  is 


i  =  V/(0)z. 


But 

„ ,,  ,  .  .  .  dAAx) 

V/(x)  =  Ai(x)  H - ~Qf~x 

so  V/( 0)  =  Ai(0)  which  has  been  shown  to  be  unique  for  all  parameterizations. 

□ 

Remark  2.2.  It  is  assumed  that  the  solution  to  the  SDRE  (12)  exists  for  all  x  in  the  neighborhood 
of  the  origin  being  considered  {and,  naturally,  {A{x),  B{x))  is  a  stabilizable  parameterization ). 
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Thus,  it  is  of  logical  consequence  that  the  solution  exists  at  x  =  0  and  that  IIo  =  11(0)  solves 
the  linear  system  algebraic  Riccati  equation  (ARE) 

IIoA)  +  A^n0  -  U0B0R  1R(flIo  +  Q  =  0,  (15) 

with  A0  =  A(0)  (here,  Aq  is  uniquely  defined  rather  than  in  [7],  where  A$  is  not  necessarily  A(0)) 
and  B0  =  B{ 0).  Thus,  a  natural  and  useful  representation  of  the  matrix  II(x)  is  the  representation 

n(z)  =  n0  +  An(x),  (ie) 

where  An(x)  =  n(x)  —  n(0)  and  AII(O)  =  0.  Likewise,  A(x)  and  B(x )  can  be  rewritten  using 
the  constant  matrices  Aq  and  Bq  and  nonconstant  matrices  A A{x)  and  A B(x)  (defined  in  a  way 
similar  to  AII(x))  as 

A{x)  =  A0  +  AA(x),  (17) 

and 

B(x)  =  Bq  +  AB(x),  (18) 

with  AA(0)  =  0  and  AB(0)  =  0.  This  leads  to  the  control  u  being  represented  as  a  the  sum  of  a 
constant  matrix  and  an  incremental  matrix, 

u{x)  =  ~(K0  +  A  K(x))x,  (19) 

where 

Ko  =  R~lBl  no,  (20) 

and 

A  K{x)  =  R~1{Bt{x)AU{x)  +  ABt(x)U0).  (21) 

By  construction  An(x)  and  A B(x)  are  zero  at  the  origin  so  that  AK{0 )  =  0.  Under  continuity 
assumptions  on  A{x)  and  B(x),  along  with  the  assumption  that  the  SDC  parameterization  is  a 
detectable  and  stabilizable  parameterization,  it  follows  that  n(x)  is  continuous.  This  follows  from 
(see  page  315,  [24]) 

Theorem  2.1.  The  maximal  hermitian  solution  II  of  the  constant  ARE 

n A  +  AtU  -  IiBR-lBTIi  +  Q  =  0 

is  a  continuous  function  of  w  =  (BR~1BT ,  A,  Q) . 

Hence,  since  we  require  that  A{x)  and  B(x)  are  continuous,  it  can  be  concluded  that  the 
solution  to  the  SDRE,  n(.x),  is  also  continuous.  Therefore,  the  norms  the  incremental  matrices 
A A(x),  A B(x),  and  A K(x)  are  small  in  a  neighborhood  of  zero.  For  the  remainder  of  this  chapter, 
we  denote  the  e-ball  around  z  as 


Be(z )  =  {x  :  ||x  —  z\\  <  e}. 

With  the  use  of  the  ideas  in  the  above  discussions,  the  following  theorem  can  be  proven: 
Theorem  2.2.  Assume  that  the  system 

x  =  f(x)  +  B(x)u 

is  such  that  f(x)  and  ( j  =  1  are  continuous  in  x  for  all  ||x||  <  f,  r  >  0,  and  that 

f(x)  can  be  written  as  f(x)  =  A(x)x  (in  SDC  form).  Assume  further  that  A(x)  and  B(x)  are 
continuous  and  the  system  defined  by  (1)  and  (3)  is  a  detectable  and  stabilizable  parameterization 
in  some  nonempty  neighborhood  of  the  origin  D  C  0).  Then  the  system  with  the  control  given 
by  (14)  is  locally  asymptotically  stable. 
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Proof.  Let  r  >  0  be  the  largest  radius  such  that  Br{ 0)  C  17.  By  the  assumption  that  the  system 
is  stabilizable  at  x  =  0  we  can  use  LQR  theory  to  create  a  matrix  Kq  such  that  all  eigenvalues  of 
Ao  —  BqKo  have  negative  real  parts.  This  implies  the  existence  of  (3  >  0  such  that  Re{ A)  <  —  /3  for 
all  eigenvalues  (A)  of  Ao  —  BqKq.  Using  the  maps  defined  by  (17),  (18),  and  (19),  we  have  that  the 
controlled  nonlinear  dynamics  can  be  rewritten  in  the  form 

x  =  A(  x)x  —  B(x)K(x)x 

=  (Ao  +  A A(x))x  -  (B0  +  AB(x))(K0  +  A K(x))x 
=  (Ao  —  BqKq)x  +  (A  A(x)  —  B(x)AK(x)  —  AB(x)Kq)x. 

If  we  let 

g{x)  =  AA(x)  -  B{x)AK{x)  -  A B(x)K0  (22) 

and  h(x)  =  g(x)x  then  the  system  is  given  by 

x  =  (A0  -  B0K0)x  +  h(x).  (23) 

Examination  of  h(x)  reveals  that  we  are  dealing  with  an  almost  linear  system  satisfying  the  property 


lim 


11^)11 

11*11 


=  0. 


This  is  easily  deduced  from  the  inequality 


IIM*)II  <  \\s(x)  II  11*11  <  (||AA(x)||  +  \\B(x)  || \\AK(x)  ||  +  ||AH(x)||||/v0||)||x||, 


(24) 


where  the  norms  of  the  incremental  matrices  are  known  to  be  continuous  and  equal  to  zero  at  the 
origin.  Since  B(x)  is  continuous,  the  norm  of  B{x)  is  bounded  for  any  bounded  region.  Hence, 
||g(x)||  — >  0  as  || a: ||  — >  0  and,  hence,  h(x)  satisfies  condition  (24).  From  theoretical  results  on  almost 
linear  systems  [8],  we  know  that  if  the  eigenvalues  of  Ao  —  BqKq  have  negative  real  parts,  h(x)  is 
continuous  around  the  origin,  and  condition  (24)  holds,  then  x  =  0  is  asymptotically  stable.  We 
outline  the  known  proof  of  this  result  for  our  specific  system  since  it  will  serve  as  a  template  for 
the  subsequent  arguments  presented  below. 

Let  C  >  0  be  given.  Then,  there  exists  a  <5  £  (0,r)  such  that  ||/i(a;)||  <  Clip’ll  whenever  ||x||  <  6. 
Let  x(0)  =  xo  £  13f(0).  By  the  assumptions  on  /  and  by  continuity,  the  solution  exists  and  remains 
in  £>f(0)  at  least  until  some  time  t  >  0.  For  t  £  [0,  t)  we  can  then  express  the  solution  to  (23)  using 
the  variation  of  constants  formula  to  obtain 

x(t)  =  e^A°~B°K°^tx(0)  +  Jq  e^A°~B°K°^t~s^h(x(s))  ds. 

Taking  the  norm  of  both  sides  we  have 

||x(t)  ||  <  ||e(j4o~BoA°)*|j||x(0)||  +  (  fy  ||e(Ao~'BoA°)(t_^||||x(s)||  d-s 

for  as  long  as  ||x(t)||  <  5.  There  exists  a  positive  constant  G  such  that 

||e(A0— B0A'0)t ||  Qe~0t 


so  that 


x 


<  Ge~pt\\x{0)\\  +  (G  f  e-W-s\\x(s)\\ds. 
Jo 
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Multiplying  by  and  invoking  the  Gronwall  inequality,  we  find 

ept\\x{t)\\  <  G||x(0)||eCGf 
or 

\\x(t)\\  <  G||x(0)||e-^-CG^  (25) 

for  all  f  >  0  such  that  ||x(i)||  <  5.  We  restrict  our  initial  condition  domain  further  by  selecting 
C  £  (0 ,/3/G).  Then,  with  5  corresponding  to  this  C,  given  e  €  (0,5]  we  select  5  =  min{5,  e/G}. 
Then,  for  xo  £  £><5(0),  we  have  that  ||x(f)||  <  e  <  5  for  all  t  >  0  and  x  =  0  is  stable.  Moreover, 
(25)  holds  for  all  t  >  0  if  xo  £  £5^(0) .  Since  f3  —  C,G  >  0  we  have  x  =  0  is  in  fact  asymptotically 
stable.  □ 


2.1  Example  1:  Exact  Solution 


In  this  section  we  consider  a  simple  example  from  [20]  that  has  an  exact  solution.  This  example 
is  of  particular  interest  because  the  numerical  method  in  [7]  failed  to  produce  a  stabilizing  control 
based  on  the  approximate  SDRE  solution.  The  cost  functional  for  the  example  under  consideration 
is 

1 


J(xq,u)  =  / 

Jo 

with  associated  nonlinear  state  dynamics 


r3  O' 

0  i 


x  +  -u~ 


dt, 


xi 

X2 


X2 

X? 


+ 


u. 


(26) 


(27) 


An  SDC  parameterization  is  given  by 


X'l 

'0 

1' 

X'l 

+ 

O' 

X'2_ 

„.2 

Lxi 

0 

_X'2_ 

1 

The  resulting  constant  and  incremental  matrices  (17)  and  (18)  have  the  form 


An  = 


0  1 
0  0 


A  A(x)  = 


0  0 
x?  0 


Bq  =  B,  and  A B{x)  =  0.  This  parameterization  has  state-dependent  controllability  matrix  given 
by 

,  To  l 

Mix)  =  1  Q 

which  has  full  rank  for  all  x  £  K2.  Therefore,  the  SDC  parameterization  is  such  that  (A(x),  B(x))  is 
controllable  for  all  x  and  ( Q 1/2,  A(x))  is  observable  for  all  x.  Hence,  we  can  assume  that  the  SDRE 
solution  can  be  used  to  construct  a  locally  stabilizing  suboptimal  control  via  (14).  The  SDRE  for 
this  SDC  parameterization  is  given  by 


n 


o  1 

x2  0 


+ 


0  xf 

1  0 


n  -  n 


2  [o  i]  n  + 


ri  o 

2  u 

0  M 


=  0, 


where  n(x)  is  a  symmetric  matrix.  The  solution  to  the  SDRE  has  the  form 

Ki+Va;i+1  T  lA  x\+\Jx\+2 


n(x)  = 


y/x\  +  l 


+  4 


2 


spTABA  +  { 
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By  setting  the  state  to  zero  we  find  that  the  solution  to  the  SDRE  at  the  origin  (15)  is 

~C3  V2~ 

n°=  h  k 

.2  2  _ 

and  An(x)  =  II(x)  —  IIo.  Since  A B(x)  =  0,  A K(x)  =  R~1BT AII(x)  and  we  have  that  g{x)  = 
||AH(x)||  +  ||R|| \\AK (x)||.  For  this  example,  we  find  that  the  eigenvalues  of  the  closed  loop  system 
are  —0.866  ±  0.5i.  In  Figure  1(a)  we  see  that  if  we  let  G  =  4  and  /3  =  0.866  then  we  have  found  a 
bound  of  the  form  ||e(j4°_so-Ko)£||  <  Ge~^.  Figure  1(b)  is  a  plot  of 

y  =  4g(x)  -  \Re(X)\  (28) 

vs.  x'i  (since  A A(x)  and  A K(x)  are  both  dependent  only  on  x\)  where  we  have  used  the  ||  •  || 2 
norm  for  g.  This  gives  us  an  idea  of  the  region  from  which  we  can  choose  initial  conditions  so 
that  —  (P  —  (G)  <  0.  By  approximating  the  zeros  of  y  we  see  that  \x\\  <  0.31  produces  negative 
values  for  —  (/?  —  £G).  Thus,  if  the  initial  condition,  xq,  is  in  60.31  (0),  we  are  guaranteed  decay  to 
zero  at  an  exponential  rate.  However,  we  shall  show  through  numerical  examples  that  the  region 
of  attraction  for  x  =  0  is  much  larger. 


Figure  1:  (a)  Different  values  for  G  in  Ge  ^  and  (b)  plot  of  y  (28)  vs.  x\,  y  being  negative  (shaded) 
indicates  region  of  attraction. 

We  first  consider  the  system  behavior  when  the  initial  condition  is  xq  =  (1,  —  1)T.  Figure  2(a) 
depicts  the  state  dynamics  of  the  controlled  system  while  Figure  2(b)  is  the  norm  of  the  state 
dynamics.  Not  only  do  the  dynamics  exhibit  decay  to  zero,  but  the  graph  of  the  norm  exhibits 
exponential  decay.  Figure  2(c)  is  the  control  and  Figure  2(d)  is  the  value  of  the  cost  functional 
integrand  over  time. 

The  second  initial  condition  that  we  consider  is  xo  =  (5,0)T.  We  find  that,  despite  this  initial 
condition  being  farther  from  the  origin,  the  system  also  converges  to  zero  asymptotically.  Fig¬ 
ures  3(a),  3(b),  3(c),  and  3(d)  correspond  to  the  state,  the  norm  of  the  state,  the  control  and  the 
cost  functional  integrand  for  the  system  with  this  initial  condition.  The  control  effort  required  to 
stabilize  the  system  is  large  relative  to  that  of  the  previous  initial  condition.  In  fact,  the  cost  for 
the  entire  time  interval  is  J  =  4023.3,  whereas  the  cost  for  the  first  initial  condition  is  0.6346. 
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(a) 


(b) 


(c)  (d) 


Figure  2:  Full  state  feedback  with  exact  SDRE  solution,  (a)  closed  loop  state  dynamics,  (b)  norm 
of  the  state  dynamics,  (c)  SDRE  formulated  control  u,  and  (d)  cost  functional  integrand  over  the 
time  span  0  <  t  <  10  with  the  initial  condition  xq  =  (1,  —  1)T. 
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(a)  (b) 


(c)  (d) 


Figure  3:  Full  state  feedback  with  exact  SDRE  solution,  (a)  closed  loop  state  dynamics,  (b)  norm 
of  the  state  dynamics,  (c)  SDRE  formulated  control  u,  and  (d)  cost  functional  integrand  over  the 
time  span  0  <  t  <  10  with  the  initial  condition  xq  =  (5,  0)T. 
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3  Numerical  Methods  for  Solving  SDRE 

In  Section  2.1,  we  presented  an  example  for  which  the  solution  of  the  SDRE  could  be  found 
analytically.  In  general  one  cannot  find  an  exact  solution  analytically.  Currently,  one  approach  for 
solving  the  SDRE  is  via  symbolic  software  packages  such  as  Macsyma  or  Mathematica.  However, 
once  the  dynamics  of  the  system  become  complicated  it  is  difficult  to  obtain  a  solution  in  this  manner 
and  it  becomes  necessary  to  approximate  the  solution.  The  method  we  describe  in  Section  3.1, 
referred  to  as  the  Taylor  series  method  [7],  works  for  systems  with  constant  control  coefficients  ( B 
is  not  dependent  on  x).  This  uses  the  methodology  of  Taylor  series  approximations  and  is  only 
effective  locally.  The  interpolation  method,  presented  in  Section  3.2,  can  be  used  for  more  complex 
systems.  This  method  involves  varying  the  state  over  the  domain  of  interest,  solving  for  and  storing 
the  control  u{x),  the  SDRE  solution  n(a;),  or  the  SDRE  gain  K{x)  in  a  grid  and  interpolating  over 
the  stored  solutions  to  approximate  the  control. 
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Grouping  like  powers  of  e  and  setting  the  coefficients  to  zero,  we  obtain  an  iterative  method  for 
finding  the  Ln  matrices.  This  iteration  scheme  is 


L0A  +  AtL0  -  L0BR~1BtL0  +  Q 

=  o, 

(30) 

Li(A  -  BR~1BtLq)  +  ( AT  -  BR~1BtL0)L1  +  L0AA  +  AAL0 

=  o, 

(31) 

Ln(A  -  BR~1BtL0)  +  (AT  -  BTR-1BL0)Ln  +  Ln_xAA 
+AAT Ln_i  -  E'felJ  LkBR~1BTLn_k 

=  0. 

(32) 

We  note  that  equation  (30)  is  the  ARE  corresponding  to  ( A,  B )  while  equations  (31)  and  (32) 
are  state-dependent  Lyapunov  equations.  This  algorithm  converges  locally  to  the  solution  of  the 
SDRE  under  the  assumptions  that  A(x)  and  B(x)  are  continuous  [32],  This  system  of  equations 
simplifies  even  further  if  A A{x)  =  g(x)AAc,  where  A Ac  is  a  constant  matrix.  By  defining 

Ln{x)  =  (g(x))n(Ln)c, 

with  ( Ln)c  a  constant  matrix,  we  obtain  the  simplified  iterative  scheme 


LqA  +  AtL0  -  L0BR~1BtLo  +  Q 
(Li)c(i  -  BR-1BtL0)  +  ( AT  -  BR-1BtL0)(L1)c  +  L0AAC  +  AACL0 
{Ln)c{A  -  BR~1BtL0)  +  ( AT  -  BTR~1BL0)(Ln)c  +  (Ln-i)cAAc 
+AATc{Ln_1)c  -  Z'kZl(Lk)cBR-1BT(Ln_k)c 


=  0, 

=  0, 

=  0. 


Therefore,  when  A{x)  is  of  this  type,  we  can  approximate  II(x)  using  constant  matrices  calculated 
offline  by  solving  one  constant  ARE  and  a  series  of  constant  Lyapunov  equations.  The  subsequent 
control  (an  approximation  of  (14))  is  given  by 


uN(x )  =  —  R 


(0 g(x))n(Ln)c )  X, 


where  N  is  the  number  of  members  of  the  series  computed  offline. 

If  solving  the  Lyapunov  equations  in  real-time  is  not  feasible  and  A A{x)  does  not  take  on  the 
form  described  above,  only  Lq  and  L\  can  be  computed  numerically  offline  (without  the  use  of 
symbolic  solutions  to  the  state-dependent  Lyapunov  equations).  To  achieve  this,  we  rewrite 

m 

A(x)  =  A  +  fj(x){AAj)c, 

3= 1 

where  fj(x)  are  the  distinct  nonlinear  terms  of  A(x)  and  (A Aj)c  are  the  corresponding  constant 
matrix  coefficients.  If  we  assume  L\  can  be  written  as 


Li  =  ^2fj{x){L{)C, 

3=1 

group  the  fj(x)  terms  together  for  each  j,  and  set  the  coefficients  to  zero,  we  can  approximate  the 
solution  to  the  SDRE  by  solving 

LqA  AA Lq  —  LqBR  ^ B^ Lq  +  Q  =  0, 

{L[)c{A  -  BR~1BtLq)  +  (iT  -  BR-1BtLq){L[)c 

+Lq(AA3c)  +  (AAjc)Lq  =  0,  j  =  1, . . . ,  m. 


SDRE  BASED  FEEDBACK  CONTROL,  ESTIMATION,  AND  COMPENSATION  15 


The  corresponding  approximate  control  ( ul  denoting  that  we  use  one  term,  in  addition  to  Lq ,  of 
the  Taylor  series)  is  given  by 


^(x)  =  -R~1Bt 


^Lo  +  XJLi Mx)j  x 


3.2  Interpolation  Method 

The  interpolation  method,  alluded  to  in  [21],  involves  formulating  the  SDRE  control  («(x)),  SDRE 
solution  (II(x)),  or  even  the  SDRE  gain  matrix  (K(x))  at  a  number  of  states  in  the  domain  of 
consideration  and  creating  a  grid  to  interpolate  over  as  the  states  vary.  We  are  able  to  construct 
an  interpolation  grid  because  the  solutions  to  the  SDRE  continuously  depend  on  the  state.  This 
technique  is  similar  in  spirit  to  gain  scheduling  and  an  optimal  control  two  point  boundary  value 
problem  described  in  [16].  We  now  proceed  to  develop  the  method  mathematically. 


3.2.1  Interpolate  the  Control 

Let  Do  denote  the  set  from  which  the  initial  conditions  are  chosen.  We  assume  that  the  state  space, 
D,  is  such  that  the  state  trajectory  for  any  initial  condition,  xq  £  Do,  is  contained  entirely  in  D.  If 
n  is  the  dimension  of  the  state,  we  create  a  finite  mesh,  D,  that  varies  each  x*,  i  =  1, . . .  ,n  over 
D.  At  each  x  £  D,  solve  and  store  the  resulting  control,  u±-  Thus,  we  have  effectively  created  an 
interpolation  grid  that  can  be  used  to  estimate  the  control  for  any  point  in  D.  In  other  words,  the 
control  can  be  approximated  by 

u(x)  =  interplu-j} 

where  interp{*}  represents  any  type  of  interpolation  (1-d,  2-d,  cubic,  linear,  etc.). 


3.2.2  Interpolate  over  II(x) 

In  SDC  form,  we  find  that  the  number  of  state  variables  on  which  A(x)  and  B(x)  depend  is  often 
less  than  the  dimension  of  the  system.  Let  Do  and  D  be  defined  as  above.  If  n  is  the  dimension  of 
the  state,  we  denote  the  r  <n  states  present  in  A(x)  and  B(x)  as  H  =  {xq,  ...,  Xjr}.  If  we  create  a 
mesh,  D,  that  varies  over  each  xq  £  5  over  the  bounds  imposed  by  D,  solve  and  store  the  resulting 
constant  SDRE  for  each  x  £  D,  we  have  effectively  created  an  interpolation  grid  for  each  element 
of  II(x).  Thus,  to  estimate  the  solution  to  the  SDRE  for  any  point  in  D,  we  can  interpolate  over 
the  elements  of  the  grid.  Thus,  we  can  approximate  the  control  with 


u(x) 


-R~1BT(x) 


interp{7rn(x)} 


interp{7r„i(x)} 


interp{7Tin(x)} 


interp{7rnn(x)} 


x, 


where  interpj*}  is  defined  as  above.  Since  II(x)  is  symmetric,  we  have  interp{7r?y}  =  interp{7Tjj} 
and  the  number  of  r-degree  interpolations  is  then  n(n  +  l)/2. 

Remark  3.1.  We  make  the  following  comments  regarding  the  interpolation  method: 


(i)  The  number  of  points  ( M )  used  for  the  mesh  should  be  sufficiently  large  so  that 
the  closed  loop  system  is  approximated  accurately  (otherwise,  there  might  be  in¬ 
stability). 
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(ii)  Interpolation  over  II(x)  is  advantageous  when  the  number  of  states,  r,  in  H  is  less 
than  n.  The  degree  of  the  interpolation  is  less  for  each  element  of  II(x)  and  the 
elements  not  present  in  H  do  not  have  bounds  invoked  by  the  mesh. 

(iii)  Interpolation  over  u(x)  is  advantageous  when  the  number  of  states,  r,  in  H  is  equal 
to  n  since  both  meshes  will  impose  bounds  on  all  states  and  the  interpolation  is  of 
the  same  complexity. 

(iv)  If  nm  <  n(n  +  1) / 2,  where  m  is  the  dimension  of  the  control,  then  the  interpo¬ 
lation  method  can  be  used  with  the  gain  matrix,  K(x)  to  decrease  the  number  of 
interpolations  needed. 

(iv)  Not  only  can  the  interpolation  method  be  used  for  nonconstant  B(x),  but  it  can 
also  be  used  in  more  complex  SDRE  formulations  that  have  nonconstant  weighting 
matrices,  Q(x )  and  R(x),  in  the  cost  functional. 


3.3  Example  1  Revisited:  Numerical  Approximations 

In  Section  2.1  the  exact  solution  to  (26),  (27)  was  used  for  the  control  of  the  system.  We  now  use 
the  numerical  methods  introduced  in  the  previous  section  to  approximate  the  solution  to  the  SDRE, 
synthesize  controls  based  on  the  approximations,  and  compare  the  results  to  the  exact  solution. 
To  generate  the  Taylor  series,  we  use  the  same  representations  as  in  Section  2  to  split  A{x)  into  a 
constant  and  state-dependent  part.  Therefore,  we  have 


A 


0  1 
0  0  ’ 


AA(x) 


0  0 
1  0 


x 


2 

1- 


Since  AA(x)  =  Acg(x),  we  can  generate  as  many  terms  of  the  Taylor  series  offline  as  desired.  For 
the  interpolation  methods  (both  u{x)  and  II(x)),  we  create  four  grids  INTul,  INTu2,  INTPI1,  and 
INTPI2  generated  by  two  different  meshes;  one  that  uses  a  constant  increment  and  another  that 
focuses  more  attention  (a  finer  mesh)  closer  to  zero.  Specifically,  the  first  mesh  D\  (corresponding  to 
INTul  and  INTPI1),  is  varied  from  [-5,5]  in  steps  of  0.5  whereas  the  second  mesh,  D 2  (corresponding 
to  INTu2  and  INTPI2),  takes  the  values  x  G  {—5, —3.5, —2.5,  2.5,  3.5, 5}  U  {  —  1  :  0.25  :  1},  where 
{a  :  b  :  c}  indicates  incrementing  from  a  to  c  in  steps  of  b.  For  interpolating  the  control,  u(x),  we 
vary  both  x\  and  X2  over  the  values  in  each  mesh  and  for  II(x)  we  only  vary  x±  since  X2  does  not 
appear  in  A(x). 

As  mentioned  in  Section  3.1,  there  are  limitations  when  using  the  Taylor  series  method  for 
approximating  the  SDRE  outside  of  a  small  interval  about  the  origin.  For  instance,  in  [7],  the 
approximate  SDRE  control  fails  for  the  system  (26),  (27)  with  initial  conditions  xo  =  (2,0)r  and 
.To  =  (4,0)T.  This  occurs  when  five  terms  of  the  Taylor  series  method  are  used  to  approximate 
II(x).  The  instability  can  be  examined  more  completely  if  we  consider  the  maximum  real  part  of 
the  eigenvalues  of  the  closed  loop  matrix  A{x)  —  BI\(x )  over  an  interval  of  x\.  If  the  maximum 
real  part  of  these  eigenvalues  crosses  the  y-axis,  the  estimation  for  II(x)  is  not  only  inaccurate 
but  also  causes  the  system  to  become  unstable.  Figure  4(a)  depicts  the  maximum  real  part  of 
the  eigenvalues  of  A(x)  —  BK{ x)  over  x\  G  {—5  :  0.2  :  5}  when  Taylor  series  of  different  lengths 
were  used  to  synthesize  the  control.  It  can  be  seen  that  the  real  part  of  the  eigenvalue  for  the 
Taylor  series  approximations  are  close  to  the  exact  solution  locally.  We  see  that  if  the  2-term  and 
8-term  expansions  are  used  for  the  approximations  to  II(x),  then  the  closed  loop  system  will  have 
eigenvalues  with  negative  real  parts.  The  other  expansions  displayed  in  the  plot  produce  unstable 
behavior  outside  a  small  neighborhood  of  the  origin.  In  fact,  the  expansion  with  five  terms  has 
eigenvalues  with  negative  real  parts  in  the  smallest  interval  of  the  six  presented.  Therefore,  it  is 
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Figure  4:  Maximum  real  part  of  the  eigenvalues  of  the  closed  loop  control  matrix  for  (a)  Taylor 
series  method  and  (b)  II(x)  interpolation  grids  INTPI1  and  INTPI2. 


not  a  surprise  that  there  were  difficulties  using  the  control  for  the  system  with  the  initial  condition 
(2,0)t  and  (4,  0)T.  We  see  from  this  example  that  using  more  terms  in  the  Taylor  series  will  not 
necessarily  result  in  more  accuracy.  For  instance,  we  observe  that  the  use  of  the  8-term  expansion 
in  control  formulation  results  in  a  closed  loop  system  that  remains  stable  for  the  entire  interval 
whereas  the  10-term  series  results  in  an  unstable  closed  loop  system  outside  of  a  small  region 
of  the  origin.  For  the  sake  of  comparison,  we  provide  the  plot  for  the  maximum  real  part  of  the 
eigenvalues  of  the  closed  loop  control  for  the  interpolation  method  (over  II(.t))  in  Figure  4(b).  Here, 
the  maximum  real  part  of  the  eigenvalues  is  close  to  exact  for  both  cubic  and  linear  interpolation 
for  the  entire  interval. 

Next,  to  study  the  accuracy  of  all  of  the  numerical  approaches,  we  use  the  approximate  feedback 
controls  to  stabilize  the  system  from  a  number  of  initial  conditions.  Since  the  2-term  and  8-term 
Taylor  series  produce  the  only  stable  closed  loop  systems,  we  present  the  results  for  the  controls 
generated  with  these  two  series.  We  use  initial  conditions  close  to  the  origin  and  relatively  far 
from  the  origin,  so  as  to  compare  the  Taylor  series  method  to  the  interpolation  method  in  both 
situations.  These  initial  conditions  are  (1,0)T  and  (3,3)r,  respectively. 

Table  3.3  lists  the  floating  point  operation  count  (FLOPS)  needed  at  each  time  step  to  compute 
the  control,  the  2-norm  of  the  difference  between  the  numerical  method  and  the  exact  solution,  and 
the  value  of  (26)  for  both  initial  conditions.  Since  INTu2  does  not  have  a  uniform  mesh,  it  requires 
more  FLOPS  to  use  interpolation.  As  can  be  seen  in  the  table,  one  2-d  cubic  interpolation  for  u(x) 
is  more  costly  than  the  three  1-d  cubic  interpolations  for  the  n(x)  grid  and  the  control  formulated 
using  the  n(x)  grid  is  also  the  most  accurate  for  both  types  of  interpolation.  Figures  5(a),  5(b), 
5(c),  and  5(d)  represent  the  states  x\  and  X2,  the  control,  and  the  cost  functional  integrand  for 
this  example  with  the  initial  condition  x$  =  (1,0)^.  The  solutions  presented  graphically  are  the 
2-term  Taylor  series  and  control  interpolation  over  INTul  and  INTu2  using  linear  interpolation. 
The  2-term  Taylor  series  is  presented  because  it  has  lower  cost  and  is  the  more  accurate  of  the 
two  stable  Taylor  series  representations.  The  specific  interpolation  methods  are  displayed  because 
they  are  the  worst  performing  of  the  interpolation  method  and  still  relatively  accurate  and  low  cost 
compared  to  the  Taylor  series  approximation.  The  second  interpolation  grid  (INTu2),  as  mentioned, 
was  created  for  better  performance  near  the  origin.  This  is  reflected  in  the  figures  since  the  control 
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Method 

FLOPS 

Dl 

Cl 

D2 

C2 

Exact 

22 

0.000 

1.967 

0.000 

7.677  x  102 

2-term  TS 

34 

2.933  x  10"1 

1.833 

9.872 

1.231  x  103 

8-term  TS 

100 

1.105  x  10”1 

2.246 

4.722 

7.717  x  1014 

INTul:  Linear 

39 

2.466  x  10”1 

2.009 

3.637  x  10”1 

7.688  x  102 

INTul:  Cubic 

475 

3.623  x  nr- 

1.964 

2.492  x  10~2 

7.676  x  102 

INTPI1:  Linear 

69 

9.062  x  10~2 

1.980 

1.047  x  10-1 

7.682  x  102 

INTPI1:  Cubic 

162 

5.334  x  10“3 

1.967 

5.909  x  10~3 

7.677  x  102 

INTu2:  Linear 

121 

6.564  x  10-2 

1.976 

2.112 

8.401  x  102 

INTu2:  Cubic 

6935 

1.556  x  10-3 

1.966 

3.541  x  10~2 

7.681  x  102 

INTPI2:  Linear 

138 

2.420  x  10”2 

1.970 

7.587  x  10”1 

7.914  x  102 

INTPI2:  Cubic 

2280 

1.011  x  10"4 

1.967 

1.628  x  10”2 

7.683  x  102 

Table  1:  Comparison  of  Numerical  Methods,  FLOPS  -  the  floating  point  operations  needed  to 
compute  the  control  for  one  time  step,  the  2-nornr  of  the  difference  of  the  approximate  state 
trajectory  and  the  exact  trajectory  for  (Dl)  xq  =  (1,0)T  and  (D2)  .To  =  (3,3)T,  and  the  value  of 
the  cost  functional  integrand  for  (Cl)  tq  =  (1,0)T  and  (C2)  tq  =  (3,3)T. 


and  subsequent  dynamics  are  closer  to  those  generated  by  the  exact  solution  than  those  generated 
by  the  uniform  grid  (INTul).  The  Taylor  series  approach  also  produces  a  control  that  is  similar  to 
the  exact  for  this  relatively  local  initial  condition. 

The  second  initial  condition,  (3,3)r  is  further  from  the  origin.  However,  all  of  the  methods 
produce  stabilizing  controls.  Figures  6(a),  6(b),  6(c),  and  6(d)  represent  the  states  x\  and  X2,  the 
control,  and  the  cost  integrand  for  this  initial  condition.  The  Taylor  series  method  depicted  (as  well 
as  the  8-term,  not  depicted)  produces  an  approximate  feedback  control  that  is  far  from  accurate. 
The  interpolation  method  produces  relatively  accurate  controls  without  much  computational  effort. 


3.4  5D  Example 


To  illustrate  a  case  when  interpolating  over  n(x)  or  K(x)  is  more  beneficial,  we  consider  the 
following  example  from  [31].  The  system  is 


X2 

'0 

o' 

X3 

0 

0 

™3 

X4 

+ 

1 

0 

O  O 

*^5  '  *^4 

0 

0 

0 

_o 

i_ 

(33) 


Associated  with  these  dynamics,  we  choose  the  quadratic  cost  functional  weighting  matrices  in  (1) 
to  be  Q  =  10/5x5  and  R  =  0. 1/2x2- 

Here,  we  use  an  obvious  SDC  parameterization 


A{x) 


0  1  0  0  O' 

0  0  10  0 
0  0  0  x2  0 

— xi  0  0  x2  1 

0  0  0  0  0 
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(a) 


(b) 


(c) 


(d) 


Figure  5:  Comparison  of  Taylor  series  and  interpolation  methods  for  2D  example,  closed  loop  state 
dynamics  for  (a)  x\  and  (b)  X2,  (c)  control  u,  and  (d)  cost  functional  integrand  over  the  time  span 
0  <  t  <  10  with  xq  =  (1,  0)T. 


SDRE  BASED  FEEDBACK  CONTROL,  ESTIMATION,  AND  COMPENSATION 


20 


(a)  (b) 


(c) 


(d) 


Figure  6:  Comparison  of  Taylor  series  and  interpolation  methods  for  2D  example,  closed  loop  state 
dynamics  for  (a)  x\  and  (b)  X2,  (c)  control  u,  and  (d)  cost  functional  integrand  from  0  <  t  <  10 
with  xq  =  (3,  3)t. 
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Method 

FLOPS 

INTPI:  Linear 

2028 

INTK:  Linear 

1292 

INTu:  Linear 

2202 

Table  2:  Comparison  of  interpolation  methods  for  5D  example. 


The  first  n  =  5  columns  of  the  state-dependent  controllability  matrix  are 


M(x) 


0  0  0  0  1 
0  0  10  0 
1  0  0  0  0 
0  0  0  1  0 
0  10  0  0 


which  has  rank  n.  There  are  only  two  states  present  in  the  A(x)  matrix  so  that  H  = 

Hence,  when  creating  the  n(x)  and  K{x)  grids  (INTPI  and  INTK,  respectively),  we  need  only  vary 
over  those  two  particular  states.  Both  of  the  grids  use  the  mesh 


Du  =  {xi  =  -2  :  0.5  :  2}  x  {x4  =  -2  :  0.2  :  2}. 


When  creating  the  grid  for  interpolation  over  u,  we  must  create  a  grid  in  five  dimensions.  This 
task  is  accomplished  by  crossing  the  Du  mesh  with  Dj  =  {xj  =  —2  :  1  :  2},  j  =  2,3,5.  There¬ 
fore,  the  offline  calculations  take  more  time  and  computational  effort  to  produce  the  grid.  The 
control  is  formulated  at  each  time  step  by  either  two  5D  linear  interpolations  for  u,  ten  2D  linear 
interpolations  for  K[x)  or  fifteen  2D  linear  interpolations  for  n(x).  Because  of  the  complexity  of 
5D  interpolations,  interpolation  over  the  u  grid  is  not  only  less  accurate  but  more  computationally 
taxing.  Figures  7(a)-(d)  represent  states  x'2,  X4,  X5,  and  control  1x2  respectively.  The  number  of 
FLOPS  that  it  took  to  formulate  the  control  at  each  time  step  for  the  different  methods  is  given 
in  Table  2.  Here,  interpolation  over  the  gain  matrix  K(x)  is  clearly  the  most  accurate  and  least 
computationally  taxing.  Note  that  in  each  of  the  following  two  examples  the  “exact”  solution  is 
obtained  by  solving  the  SDRE  evaluated  at  the  state  (so  that  it  is  an  ARE)  at  each  time  increment 
of  the  integration. 


3.5  3D  Example 


The  last  example  we  present  is  referred  to  as  the  toy  example  [13].  We  present  this  example  because 
it  exhibits  complexities  not  yet  encountered  in  the  previous  examples.  First,  it  has  nonconstant 
weighting  matrices  for  the  state.  Specifically,  we  seek  to  minimize  the  cost  functional 


/*oo 

J(xq,u)=  /  xtQ(x)x  +  uT  Rudt 

Jo 


where 


Q(x)  =  0.01 


0  0  0 

0  eXl  1 
0  1  e~Xl 


xi  =  fi(x) 

X2  =  e~XlX3  +  e~Xl/2u  , 
X3  =  —  eXlX2  +  eXlJ2u 


with  respect  to  the  state  dynamics 
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(a)  (b) 


(c) 


(d) 


Figure  7:  Comparison  of  interpolation  methods  for  5D  example,  closed  loop  state  dynamics 
for  (a)  X2,  (b)  24,  and  (c)  25,  and  (d)  control  112  over  the  time  span  0  <  t  <  10  with 
®o=(0.4,  -0.2,  0.1,  -0.1,  0.5)t. 
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(a) 


(b) 


Figure  8:  c\  with  fc  step  size  (a)  0.0001  and  (b)  0.01. 


where 

fi{x)  =  X[_3,3](^i)sgn(eXlx’|  -  e~Xlxl). 

Second,  since  the  dynamics  are  not  continuous  we  must  use  an  approximate  continuous  parameter¬ 
ization  to  formulate  the  SDRE  control.  As  detailed  in  [13],  the  x\  dynamics  used  to  formulate  the 
control  are  approximated  and  stabilized  using 


x\  «  satsin(/c)  —  x±, 


where 


sin  (§£)  if  |0|  <  5 
satsin(0)  =  ^  1  if  9  >  5 

-1  if  9  <  - 5 


f  —  PX i  2  _  -x  1  2 

jc  —  c  x2  e  x3, 


and  5  =  0.001.  Then,  defining 


ci  = 


_  satsin(/e) 


fc 


,  C2  = 


exl  —  1 

X\  5 


and  c3  = 


XI 


we  can  rewrite  the  dynamics  used  to  formulate  the  control  as 

X\  ~  Cl(c2Xl  +  l)x|  -  C1(c3X1  +  l)x%  -  X\ 
%2  =  (C3X1  +  l)x3  +  e~Xl^2u 

x3  =  —(C2XI  +  1)X2  +  eXl/2 


This  introduces  the  last  complexity  that  is  of  importance.  The  limits  of  ci,  c2,  and  c3  as  fc  and 
x\  go  to  zero  exist  (and  are  2/(6tt),  1,  and  —1  respectively)  and,  hence,  each  are  well  behaved. 
However,  as  we  increase  the  increment  of  fc  from  0.0001  to  0.01,  it  decreases  the  appearance  of 
continuity  in  ci  (see  Figure  8).  Therefore,  the  grid  step  size  must  be  kept  reasonably  small  to  keep 
the  closed  loop  system  stable.  This  is  also  problematic  for  time  integration,  even  using  the  exact 
solution.  As  can  be  seen  in  [13],  there  is  some  discontinuous  behavior  depicted  in  the  plots  of  the 
control.  In  fact,  we  found  that  we  had  to  increase  the  tolerance  (of  the  Matlab  routine  odel5s)  in 
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Method 

FLOPS 

INTPI:  Linear 

1413 

INTK:  Linear 

696 

INTu:  Linear 

246 

Table  3:  Comparison  of  interpolation  methods  for  3D  example. 


order  to  integrate  the  system  with  the  exact  solution  (the  SDRE  being  solved  at  each  time  step) 
for  some  initial  conditions. 

By  (4),  we  can  parameterize  these  dynamics  in  many  ways.  In  [13]  (here  we  correct  a  mistake 
made  therein),  the  terms  with  more  than  one  multiple  of  the  state  are  represented  as 


C\C2X\x\ 

ClC3Xixl 

C3XIX3 

C2XIX2 


ai(ciC2xl)xi  +  (1  -  ai)(cic2xix2)x2, 
a2(c1c3xl)x1  +  (1  -  a2)(cic3xix3)x3, 
a3(c3xi)x3  +  (1  -  a3)(c3rr3)xi, 
a4(c2X2)xi  +  (1  -  a4)(c2.Ti)x2. 


We  choose  ot\  =  a2  =  0  and  a3  =  a\  =  1  resulting  in  the  parameterization 


A(x) 


-1  C1C2X1X2  +  C4X2  -C1C3X1X3  -  C1X3 
C3X3  0  1 

-C2X2  -1  0 


(34) 


and 


B(x) 


0 

e-a;i/2 

ex\j2 


The  grids  that  we  use  result  from  varying  x\,  x2  and  x3  over  the  mesh  D  =  {—3.5  :  0.2  :  3.5}.  As 
depicted  in  Figure  9,  interpolation  over  II(x),  K(x),  and  u(x)  all  stabilize  the  system.  The  FLOPS 
needed  to  compute  the  solution  for  each  method  are  given  in  Table  3.  Since  u(x)  only  requires 
one  interpolation,  it  is  evident  that  interpolation  over  this  grid  is  the  most  efficient  method  to 
approximate  the  control. 


4  Nonlinear  Estimation 

We  now  extend  the  SDRE  theory  for  state  feedback  to  state  estimation.  This  is  of  particular 
interest  for  nonlinear  systems  with  state-dependent  output  coefficients.  Specifically,  for  systems 
that  can  be  put  in  the  form 

x  =  f(x)  =  A(x)x,  (35) 

where  x  G  and  f(x)  is  nonlinear  with  output  given  by 

y  =  C\x)x,  (36) 

where  C(x)  is  a  continuous  m  x  n  matrix- valued  function.  From  [14],  we  see  that  a  system  of  the 
form 

xe  =  A(xe)xe  +  ^(xg,  (C(x)x  -  C(xe)xe))  (37) 

is  a  state  estimator  for  this  system  if  the  error  is  asymptotically  stable  at  the  zero  equilibrium. 
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Figure  9:  Comparison  of  interpolation  methods  for  3D  example,  closed  loop  state  dynamics  for  (a) 
x\,  (b)  X2,  and  (c)  23,  and  (d)  control  u  over  the  time  span  0  <  t  <  20  with  xo=(0,2,3)T. 
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We  can  now  formulate  the  proposed  nonlinear  estimator  by  mimicking  the  theory  of  estimators 
for  linear  systems.  If  the  system  were  linear,  it  is  well  known  that  the  optimal  observer  is  given  by 

*(z,  0 Cx  -  Cz ))  =  L(Cx  -  Cz), 

where  L  =  TCTV~l  and  T  solves  the  ARE  for  the  dual  system.  Likewise,  under  certain  assumptions, 
we  obtain  a  suboptimal  observer  for  the  nonlinear  system  by  exploiting  the  dual  system.  Hence,  to 
find  the  suboptimal  observer,  we  consider  the  cost  functional 

1  f°° 

J(x,u)  =  -  /  xTUx  +  uTVudt,  (38) 

2  Jo 

with  associated  state  dynamics  given  by 

x  =  At(x)x  +  CT{x)u, 


where  U  6  Hnxn  is  SPSD  and  V  €  is  SPD.  Similar  to  the  linear  theory,  we  choose 

^f(xe,  (C(x)x  -  C(xe)xe))  =  L(xe)(C(x)x  -  C(xe)xe), 


where 

L{xe)  =  T(xe)CT(xe)V -1 


and  r(xe)  solves  the  dual  SDRE, 

r (xe)AT(xe)  +  A(xe)T(xe)  -  T(xe)CT (xe)V^1C(xe)T(xe)  +  U  =  0. 
The  system  with  the  proposed  observer  is  given  by 


(39) 

(40) 


A(x)x, 

(41) 

A(xe)xe  +  L(xe)(y  -  C(xe)xe), 

(42) 

C(x)x. 

(43) 

(44) 

It  remains  to  be  shown  that  the  error  for  the  system  is  asymptotically  stable  to  zero.  To  develop 
the  theory,  we  require  the  following  remarks  and  observations: 

Remark  4.1.  The  observer  based  SDRE  has  a  solution  at  each  x  in  the  state  domain,  including 
x  =  0. 

Define  Aq  =  A(0)  and  Cq  =  (7(0).  Then,  the  solution  at  x  =  0,  To,  is  the  solution  to  the  ARE 

r<X  +  a0t0  -  r0c'oT v^CoTq  +  u  =  o.  (45) 

Therefore,  each  of  the  state-dependent  matrices  can  be  represented  by  the  maps 

A{x)  =  A0  +  A  A(x), 

C(x)  =  C0  +  AC(x), 


and 


r(s)  =  r0  +  Ar(x). 
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It  follows  directly  that 

L(x)  =  L0  +  A  L(x), 

where 

L0  =  T0C^V~1  (46) 

and 

A  L{x)  =  {AT{x)Ct(x)  +ToACt{x))V~1. 

Here,  Lq  is  the  gain  matrix  that  stabilizes  (Aq,Cq).  Consequently,  Ho  —  LqCq  has  all  eigenvalues 
with  negative  real  parts.  Also,  since  AH(x),  AC(s),  and  Ar(x)  are  zero  at  x  =  0  and  continuous, 
they  are  small  and  bounded  in  a  neighborhood  of  the  origin.  As  a  result,  AL(0)  =  0  and  A L{x)  is 
small  in  a  neighborhood  of  the  origin. 

We  are  now  able  to  prove  the  following  theorem: 

Theorem  4.1.  Assume  that  f(x)  and  (j  =  1  ,...,n)  are  continuous  in  x  for  all  ||x||  <  r, 
f  >  0  and  that  x  =  0  is  a  stable  equilibrium  point  of  (41).  Additionally  assume  that  f(x)  and 
y(x)  can  be  put  into  SDC  form  such  that  f(x)  =  H(x)x  and  y(x)  =  C(x)x  where  (H(x),C(x))  is  a 
detectable  parameterization  and  A(x)  and  C(x)  are  locally  Lips chitz  for  all  x  &  II  C  Br( 0),  where 
Ll  is  a  nonempty  neighborhood  of  the  origin.  Then  the  estimated  state  given  by 

xe  =  H(xe)xe  +  L(xe)(C(x)x  -  C(xe)xe),  (47) 

( where  L(x)  is  given  by  (39))  will  converge  locally  asymptotically  to  the  state. 

Proof.  Let  r  >  0  be  the  largest  radius  such  that  Br( 0)  C  Q.  As  in  the  linear  case,  we  consider  the 
error  between  the  actual  state  and  the  estimated  state  given  by 

e(t)  =  x(t)  -  xe(t). 


The  error  must  satisfy  the  differential  equation 


e(t)  =  x(t)  -  xe(t), 

with  the  initial  condition  eo  =  e(0)  =  xo  —  xeo,  where  xq  =  x(0)  and  xeo  =  xe(0).  Substituting  the 
state  (35)  and  state  estimator  dynamics  (47)  into  the  error  dynamics  yields, 

e  =  A(x)x  —  H(xe)xe  —  L(xe)(C(x)x  —  C(xe)xe) 

=  (Ho  —  LoCo)e  +  AH(x)x  —  AH(xe)xe 

—  L(xe)(A(7(x)x  —  AC'(Xe)Xe)  —  AL(xe)C'oe. 


We  set 


g(x,xe,e)  =  AH(x)x  —  AH(xe)xe 

—  L(xe)(AC(x)x  —  A  C'(Xe)Xe) 
-  AL(xe)Coe. 


By  the  hypotheses  that  H(x)  is  locally  Lipschitz  on  H,  there  exists  a  ka  >  0  such  that 


||H(x)  -  H(xe) |j  <  ka \\e 


(48) 


This  Lipschitz  condition  extends  to  AH(x)  =  H(x)  —  Hq  so  that 
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This,  in  turn,  implies  that 

||Ayl(x)x  —  AA(xe)xe\\  =  ||A  A(x)x  —  AA(xe)x 

+  AA(xe)x  —  AA(xe)xe\\ 

<  (ka\\x\\  +  ||AA(xe)||)  ||e||. 

By  the  conditions  placed  on  C(x)  there  exists  a  Lipschitz  constant,  kc ,  such  that  (upon  the  same 
manipulation  as  above) 

|| A C(x)x  -  AC(xe)xe\\  <  (kc,||*||  +  ||  A(7(xe) || )  ||e||. 

For  notational  brevity,  we  set 

h{x,xe)  =  (rc^lMI  +  ||  Aj4(xe)  ||  +  /cc||x||||L(xe)|| 

+  l|AC,(xe)||||L(xe)||  +||AL(xe)||||C'0||) 

and  conclude  that 


\\g(x,xe,e)\\  <  h(x,xe)\\e 


(49) 


By  construction  of  the  incremental  matrices, 


lim  h(x,  xe)  =  0. 

x,xe—>0 

Due  to  the  detectability  condition  at  the  origin  and  the  use  of  SDRE  (and  due  to  construction, 
LQR)  techniques  to  find  the  gain,  Lq,  there  exists  (3  >  0  and  G  >  0  such  that 

||exp(^0  -  L0Cb)||  <  Gexp(-/3f). 


Then,  given  r/  G  (0 ,(3/G)  let  e  G  (0,  r)  be  such  that  h(z,z )  <  r/  for  all  z,z  G  Be( 0)  C  D.  The 
equilibrium  x  =  0  is  stable  implying  that  there  exists  a  6  G  (0,  e/2]  such  that  ||x(f)||  <  e/2  for  all 
time  so  long  as  ||xo||  <  5.  Let  xo  and  eo  be  such  xo  G  B§( 0)  C  D  and  eo  G  0),  where  e  =  e/(2 G) 
and  G  =  max{G,  1}.  Then  the  error  dynamics  have  a  local  solution  and  are  still  contained  in  Br( 0), 
possibly  on  only  a  small  interval  [0,  t),  and  so  long  as  the  solution  exists  it  can  be  expressed  by  the 
variations  of  constants  formula 


e{t) 


exp((A0— T0Cl0)f)e(0) 

+  /  exp((^0  -  L0C0)(t 

J  o 


s))9Ws)^e(s),eW)  ds- 


(50) 


By  continuity  there  exists  some  finite  time  t  G  ((),  t\  such  that  e(t)  is  in  the  ball  Be/2( 0).  Then, 
upon  taking  the  norm  of  both  sides  of  (50),  we  find  that  for  all  t  G  [0,  t) 


<  Gexp(— /3t) ||e(0) ||  +  Giq  /  exp(— /?(f  -  s))||e(s)||  ds. 


(51) 


We  now  multiply  by  exp  (fit)  and  apply  the  Gronwall  inequality  so  that,  for  t  G  [0,  i), 


exp(/3t)  \\e(t)  ||  <  G||e0||exp(Gr/t) 


(52) 
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and  it  follows  that 

||e(t)||  <  G||e0||  exp((G?/  -  (3)t).  (53) 

By  the  choice  we  have  made  for  eo  then 

l|e(t)||  <  |exp((Gty -/?)*).  (54) 

But  the  bound  7]  holds  only  as  long  as  xe(t)  remains  in  Be( 0).  However,  because  ||xe(f)||  —  ||x(t)|j  < 

l|e(t)||, 

(MOII  <  |  +  |exp((Gr/ -/?)£).  (55) 

Hence,  by  7]  <  (5/G,  xe  stays  in  Be( 0)  for  all  time  and  the  state  estimator  converges  exponentially 
to  the  state.  □ 


4.1  State  Estimator  Example  One 

This  example  is  a  simple  demonstration  of  the  SDRE  technique  for  state  estimation.  The  system 
that  we  wish  to  observe  (from  [29])  is  given  by 


X\  =  X2xf  +  X2, 
X 2  =  -x\  ~  X\, 

y  =  xi. 

In  SDC  form,  the  above  system  can  be  rewritten  as 


x 

y 


0  1  +  x\ 

_-(l  +  xf)  0 

[l  0]  x. 


The  state-dependent  observability  matrix  is 


O(x) 


1  0 

0  1  +  x\ 


Since  0(x)  has  full  rank  throughout  3?2,  the  system  is  observable. 

We  set  the  weighting  matrices  in  (38)  to  U  =  10 1  and  V  =  0.01.  We  use  the  interpolation 
method  to  approximate  T(x)  where  the  estimator  mesh  is  Mx  =  {—2  :  0.1  :  2}.  We  use  an 
initial  condition  of  xq  =  (1,1)T  for  the  state  and  xeo  =  (0.5,0.5)T  for  the  estimated  state.  As  a 
comparison,  we  also  display  the  state  estimator  found  in  [29]  given  by 

z\  =  — 5zi  +  z2  +  5y  +  z\z\ 

Z2  =  — H^i  -  zf  +  10 y 


which  we  refer  to  as  the  Thau  estimate  (for  this  system).  Figures  10(a)  and  (b)  exhibit  the  fast 
convergence  of  each  state  estimate  to  x\  and  X2 ■  Figure  10(c)  is  a  state-space  representation  of  the 
state  estimator  over  time,  included  as  comparison  for  the  plot  given  in  [29].  Figure  10(d)  represents 
the  norm  of  the  difference  between  the  actual  state  and  estimated  state.  We  see  that  the  error 
decreases  gradually  for  the  SDRE  technique  while  the  error  for  the  Thau  state  estimator  is  erratic. 
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(a) 


(b) 


-  Thau  Error 
SDRE  Error 
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Time  (sec) 


(c) 
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Figure  10:  Plots  for  Example  4.1,  (a)  state  x\  with  the  Thau  and  SDRE  estimates  of  x\,  (b)  state 
X2  with  the  Thau  and  SDRE  estimates  of  X2,  (c)  state-space  representations  x-2  vs.  x\  for  actual 
state  and  state  estimators,  and  (d)  error  of  the  Thau  and  SDRE  state  estimators  over  the  time 
span  0  <  t  <  10  with  state  initial  condition  xq  =  (1, 1)T  and  state  estimator  initial  condition 
Xe0  =  (0.5,  0.5)t. 
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4.2  State  Estimator  Example  Two:  Unstable  Zero  Equilibrium 


We  now  present  an  example  that  shows  the  effectiveness  of  the  estimator  synthesized  through  the 
use  of  SDRE  techniques  even  when  the  hypotheses  of  the  theorem  are  not  satisfied.  Consider  the 
system 


xi  =  x2, 

x2  =  -x2\x2\, 


(56) 


with  measurement 


y  =  xi. 


Lyapunov  stability  theory  states  that  if  there  exists  a  Lyapunov  function,  V ,  such  that  in  every 
neighborhood  of  the  origin  V  and  V  have  the  same  sign,  then  the  origin  is  unstable.  Using  the 
Lyapunov  function  V(x)  =  ||a;|||  we  see  that 


V(x)  =  X\X\  +  x2x2 
=  X 1  x2  -  x%\x2 


If  we  let  e  €  (0,1)  and  consider  the  vector  x  =  (e/2,e/2)r  G  Be( 0).  Then,  V(x)  >0  and  since 

e2  <  e, 


V(x) 


>  0. 


Therefore,  zero  is  unstable  for  the  given  system  and  this  violates  the  hypotheses  of  the  theorem. 

However,  numerically  we  shall  show  that  the  state  estimator  constructed  with  SDRE  techniques 
works  very  well.  As  a  comparison,  we  consider  the  following  second  order  estimator  from  [29]  (we 
will  again  refer  to  this  as  the  Thau  state  estimator): 


i  i  =  — 20zi  +  z2  +  lOy, 
z2  =  — 100zi  -  z2\z2\  +  lOOy. 


We  can  rewrite  (56)  in  the  SDC  form  f(x)  =  A{x)x  where 


A{x) 


0  1 

0  \x2 


Since 

C=[  1  0]  , 

the  state-dependent  observability  matrix  is  O(x)  =  I ,  so  we  have  that  this  system  is  observable  for 
all  x.  For  the  weighting  matrices  in  (38),  we  use  U  =  50/  and  V  =  0.1.  To  approximate  the  solution 
to  the  SDRE,  we  use  the  interpolation  method  for  n(x)  and  the  mesh  used  is  Mx  =  {—5  :  0.2  :  5}. 
We  set  the  initial  condition  of  the  state  to  xq  =  (1,1)T  and  the  initial  condition  of  each  state 
estimator  to  xeo  =  (0,  0)T.  Figure  11  depicts  the  behavior  of  the  Thau  and  SDRE  state  estimators 
over  a  five  second  time  span.  Figure  11(a)  exhibits  the  fast  convergence  of  each  state  estimator  to 
x\,  where  Figure  11(b)  conveys  that  each  state  estimator  takes  more  time  to  converge  to  x2.  The 
Thau  state  estimator  is  erratic  at  first  and  becomes  closer  to  the  actual  state  quicker,  whereas  the 
SDRE  estimator  converges  slower  but  does  not  stray  as  far  initially.  Figure  11(c)  is  a  state-space 
representation  of  the  state  estimator  over  time,  included  as  comparison  for  the  plot  given  in  [29]. 
Figure  11(d)  represents  the  norm  of  the  difference  of  the  actual  state  and  estimated  state.  We 
see  that  the  error  decreases  gradually  for  the  SDRE  technique  while  the  error  for  the  Thau  state 
estimator  increases  initially  and  then  decreases  abruptly. 
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Figure  11:  Plots  for  Example  4.2,  (a)  state  x\  with  the  Thau  and  SDRE  estimates  of  x\,  (b)  state  X2 
with  the  Thau  and  SDRE  estimates  of  X2 ,  (c)  state-space  representations  X2  vs.  x\  for  actual  state 
and  state  estimators,  and  (d)  norm  of  the  error,  || e(t )  1 1 2 ,  for  the  Thau  and  SDRE  state  estimators 
over  the  time  span  0  <  t  <  5  with  state  initial  condition  xo  =  (1,  L)T  and  state  estimator  initial 
condition  xeo  =  (0,0)T. 
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5  Compensation  Using  the  SDRE  State  Estimator 


In  this  section,  we  investigate  the  use  of  the  SDRE  state  estimator  in  state  feedback  control  laws 
to  compensate  given  nonlinear  systems.  We  consider  systems  that  can  be  put  in  the  SDC  form 

x  =  A(x)x  +  B(x)u, 
y  =  C(x)x, 

where  A(x)  is  a  continuous  n  x  n  matrix- valued  function,  B(x)  is  a  continuous  n  x  m  matrix¬ 
valued  function  and  C(x)  is  a  continuous  p  x  n  matrix- valued  function.  We  denote  II(x)  as  the 
solution  to  (12),  where  Q  and  R  are  the  matrices  from  the  cost  functional  (1)  and  T(x)  as  the 

solution  to  (40),  where  U  and  V  are  the  matrices  from  the  cost  functional  (38).  If  we  again  let 

L(x)  =  T(x)Ct(x)V~1  and  K(x)  =  R~1BT(x)H(x),  the  control  using  estimator  compensation  can 
be  formulated  as 

x  =  A(x)x  —  B(x)K  (xe)xe, 

xe  =  A(xe)xe  -  B(xe)K(xe)xe  +  L(xe)(y  -  C(xe)xe),  (57) 

y  =  C(x)x. 


To  show  that  the  compensated  system  converges  asymptotically  to  zero  in  a  neighborhood  about 
the  origin,  we  require  the  following  remarks: 


Remark  5.1.  The  block  matrix 


A 

O' 

+ 

r° 

B 

0 

0 

[o 

0 

'0 

ol 

'0 

ol 

+ 

C 

0 

+ 

0 

D 

=  P||  +  ||R||  +  ||C'||  +  ||D 


Remark  5.2.  The  matrix 


H(x) 


A{x)  —  B(x)K(x)  0 

0  A(x)  —  L(x)C(x) 


(58) 


has  all  eigenvalues  with  negative  real  part  for  all  x  such  that  the  state- dependent  controllability  and 
observability  matrices  have  full  rank.  This  follows  directly  from  the  eigenvalue  separation  property 
[1],  since  at  each  x,  H(x)  is  a  constant  block  diagonal  matrix  and  the  blocks  each  have  eigenvalues 
with  real  part  negative. 


We  are  now  able  to  prove  the  following  theorem  for  the  compensated  system: 
Theorem  5.1.  Assume  that  the  system 


x  =  f(x)  +  B(x)u 

is  such  that  f(x)  and  (j  =  1  ,...,n)  are  continuous  in  x  for  all  ||x||  <  f,  f  >  0,  and  that 
f(x)  can  be  written  as  f(x)  =  A(x)x  (in  SDC  form).  Assume  further  that  A(x)  and  B(x)  are 
continuous.  If  A(x),  B(x),  and  C(x)  are  chosen  such  that  the  pair  (A(x),C(x))  is  detectable  and 
(A(x),  B(x))  is  stabilizable  for  all  x  6  P  C  B?( 0)  (where  D  is  a  nonempty  neighborhood  of  the 
origin),  then  (x,e)  =  (0,0)  for  system  (57)  is  locally  asymptotically  stable.  Here,  e  =  x  —  xe  is  the 
error  between  the  state  and  the  state  estimate  in  (57). 
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Proof.  Let  r  >  0  be  the  largest  radius  such  that  Br( 0)  C  0.  Using  the  mapping  techniques  described 
in  the  preceding  proofs,  one  has 


A(x)  =  A0  +  AA(x), 

B(x)  =  Bq  +  A  B(x), 

C(x)  =  C0  +  AC(x), 

K(x)  =  K0  +  A  K(x), 

L(x)  =  L0  +  A  L(x), 

with  Lq  =  L(0)  and  AL(0)  =  0  (etc.)  for  all  of  the  matrices.  The  error  between  the  actual  state 
and  the  estimated  state  satisfies  the  differential  equation 


e(t)  =  x(t)  -  xe(t), 


with  e(0)  =  x(0)  —  xe(0).  We  substitute  the  state  and  state  estimator  dynamics  (57)  for  x  and  xe 
yielding 

e  =  A(x)x  -  A(xe)xe  -  (B(x)  -  B(xe))K(xe)xe  ,  . 

-L(xe)(C(x)x  -  C(xe)xe) 


As  in  the  theory  of  linear  systems,  the  dynamics  of  both  eft)  and  x(t)  are  of  interest.  Hence,  the 
system  to  be  considered  is 

~x(t) 

_e(t)J’ 


with  x  defined  in  (57)  and  e  given  by  (59).  In  order  to  proceed,  it  is  convenient  to  put  the  system 


into  the  form 


x(ty 

'Hu  Hu 

'x(ty 

1 

Sn(x,xe) 

Su{x,xe) 

x(ty 

e{t)_ 

H-21  H-22 

e{t)_ 

S2i(x,xe) 

S22(x,Xe)_ 

e(t)_ 

(60) 


Rewriting  the  state  dynamics  with  the  given  maps,  we  have  that 


x  =  (Aq  —  BoKo)x  +  A  A(x)x  —  BoAK(xe)xe  —  A  B(x)K(xe)xe. 

To  put  this  in  the  proper  form,  we  add  and  subtract  the  terms  BoAK(xe)x  and  AB(x)K(xe)x 
resulting  in 

x  =  (Ao  —  BqKq)x  +  AA(x)x  —  BoAK(xe)x  +  BoAK(xe)e 
—A  B(x)K(xe)x  +  A  B(x)K(xe)e. 

Thus, 

Hu  =  Ao  —  BqKq, 

Sn(x,xe)  =  AA(x)  —  BoAK(xe)  —  AB(x)K(xe), 

H\ 2  =  0, 

Si2{x,xe)  =  B0AK(xe)  +  AB(x)K(xe). 

Now  we  seek  to  formulate  the  error  dynamics  in  a  suitable  manner.  Rewriting  the  error  dynamics 
with  the  given  mappings  yields 

e  =  (Ao  —  LoCo)e  +  A  A(x)x  —  A  A(xe)xe 

—(A  B(x)  —  AB(xe))K(xe)xe  —  AL(xe)(AC(x)x  —  A  C(xe)xe) 

—AL(xe)Coe  —  Lq(AC(x)x  —  A  C(xe)xe). 
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We  eliminate  all  of  terms  with  xe  multiples  by  adding  and  subtracting  terms  (if  the  term  consists 
of  a  matrix  times  xe  we  add  and  subtract  that  matrix  times  x).  This  leaves  us  with 

e  =  (Ao  —  LoCo)e  +  AA{x)x  +  A^4(xe)e  —  AA(xe)x 

—  (A  B(x)  —  AB(xe))K(xe)x  +  (A  B(x)  —  A  B(xe))K(xe)e 
—AL(xe)AC(x)x  +  AL(xe)AC(xe)x  —  AL(xe)AC(xe)e 
—AL(xe)Coe  —  LqAC(x)x  +  LoAC(xe)x  —  LoAC(xe)e. 

Thus,  we  have  manipulated  the  error  dynamics  as  desired  and  we  have  that 


H21 

S2 i(x,xe) 


H22 

S22(x,xe) 


0, 

AA(x)  —  A^4(xe)  —  A  B(x)K(xe)  +  AB(xe)K(xe) 
—AL(xe)AC(x)  +  AL(xe)AC(xe )  —  LqAC(x) 
+LoAC(xe), 

Ao  —  LqCq, 

A A(xe)  +  A B(x)K(xe)  -  A B(xe)K(xe) 
—AL(xe)AC(xe)  -  AL(xe)C0  -  L0AC(xe). 


Let 


and 


H  = 


H 11 
H2l 


HV2 

h22 


Sn(x,xe)  S  i2(x,xe) 
S2  l(x,Xe)  S22(x,Xe ) 


By  Remark  5.1,  we  can  bound  the  matrix  norm  by 


\\S(x,  xe)\\  <  ||<S'n(x,  xe)||  +  ||S,i2(x,xe)||  +  ||S2i(x,xe)||  +  ||S22(x,xe 


Then,  taking  the  norm  of  each  of  these  matrices,  we  find 


||<Sii(x,xe)||  <  ||AA(x)H  +  ||L>o||||A/i(xe)||  +  ||AR(x)||  ||RT(xe)||, 
il<S,2i(x,xe)jj  <  i|AA(x)||  +  ||AA(xe)||  +  \\AB{x)\\\\K(xe)\\ 

+  ||AL>(xe)  |j  ||iv  (xe)  ||  +  ||AL(xe)||||AC(x)|| 

+  ||AL(xc)||||AC-(xe)||  +  ||Lo||||AC(x)||  +  ||L0||||AC(xc)||) 
||Si2(x,xe)||  <  ||5o||||AA'(xe)||  +  ||AS(x)||||if(xe)||, 

\\S22(x,xe)\\  <  ||AA(xc)||  +  \\AB(x)\\\\K(xe)\\  +  \\AB(xe)\\\\K(xe)\\ 

+  ||AL(Xe)||||AC(Xe)||  +  ||AL(Xe)||||Co|| 

+  ||L0||||AC(xe)||. 


Therefore, 

||5(x,xe)||  <  2(||AA(x)||  +  ||AA(xe)||)  +  2\\B0\\\\AK(xe)\\ 

+2||/i(xe)||(2||AR(x)||  +  ||AB(xe)||)  +  2||  AL(xe)  ||  ||  AC(xe)  || 
+  ||AL(xe)||||AC(x)||  +  ||Lo||(2||AC(xe)||  +  ||AC(x)||) 

+  ||Co||||AL(xe)|| 

=  g(x,xe). 


By  definition  of  the  incremental  matrices,  as  x,  xe  — >  0,  g(x,  xe)  — >  0.  Thus,  for  any  7]  >  0  there 
exists  an  a  €  (0,  r)  so  that  if  z,  z  G  Ba( 0)  then 


l|S(M)||  <  f?. 
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Using  the  variations  of  constants  formula  with  nonhomogeneous  part 

S(x,xe)  X  , 

we  have  that  the  solution  for  the  system  (so  long  as  it  exists)  is  given  by 


x(t) 

e(f) 


=  exp  (Ht) 


x0 

eo 


+  /  exp (H(t  —  s))S(x,xe) 


x(s) 

e(s) 


ds. 


If  we  assume  that 


x0 

eo 


£  ^a/2(0) 

then  xo  G  Ba/ 2(0)  and  eo  G  Ba/ 2(0)  (which  implies  that  xeo  G  Ba( 0))  so  that 

||*5(x,xe)||  <  rj . 


Then,  for  as  long  as 


the  inequality 


x(f) 

e(f) 


< 


a 
2  ’ 


x(t) 

,e(0. 


<  ||  exp(Ht)  || 


x0 

eo 


exp  (H(t 


x(s) 

,e(s). 


ds 


holds. 


By  the  eigenvalue  separation  principle  (see  Remark  5.2),  there  exists  a  constant  (3  >  0  such  that 


real 


jeigs  f 


Ao  —  B0K0  0 

0  Aq  —  L0C0 


<-f3. 


We  know  there  also  exists  a  G  >  0  such  that 

||  exp(#f)  ||  <  Gexp(— fit). 
Let  t  represent  the  amount  of  time  that 


x(f) 

e(*). 


e  Ba/ 2  (0)  - 


Then,  for  t  G  [0,  t)  the  state  and  error  dynamics  are  bounded  by 


x(t) 

e{t) 


<  Gexp(— /3t) 


x0 

eo 


+ 


Gr]  I  exp (-/3(t-s)) 
Jo 


x(s ) 
e{s\ 


ds. 


Upon  multiplying  both  sides  by  exp  (fit),  we  have  the  relation 


exp  (fit) 


x(t) 

e(t) 


<  G 


x0 

eo 


+ 


exp(/3s) 


x(t) 

e(i) 


ds. 


and  invoking  the  Gronwall  inequality  we  obtain 


x(t) 

,e(*). 


exp  (fit)  < 


x0 

eo 


Gexp(Grjt). 
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Multiplying  through  each  side  by  exp (—/3t),  we  thus  find  that 

[pSl  <  M  Gexp(-(P-Gri)t)  (61) 

[eoJ 

for  all  t  G  [0.  t).  Recall  that  the  bound  q  holds  true  so  long  as  the  trajectories  of  both  x(t)  and 
xe(t)  remain  in  an  a— ball  of  the  origin.  Thus,  we  must  also  consider  the  bound  placed  upon  xe(t). 
Using  the  inequality 

ll*e(t)||  -  \\x(t)\\  <  ||e(t) ||  < 

we  obtain 

||a?e(i) ||  <  \\x{t)\\  +  X°  Gexp(-(/3  -  Grj)t) 

LeoJ 

<2  'r°  G  exp(— (/?  —  Grj)t). 

|_eoJ 

To  this  point,  r/  was  an  arbitrary  constant.  However,  if  we  set  0  <  q  <  (3/G  and  let  e  G  (0,  a] 
(where  a  corresponds  to  this  specific  rj)  be  given  we  find  that  so  long  as 


H  €  Bs( 0) 

_eo_ 

with  5  =  e/(2 G)  (where  G  =  max{l,G}),  then 


and 


xe(t)  ||  <  e 


so  that  both  bounds  are  for  all  t.  Since  the  solution  can  be  continued  to  the  boundary  (and  hence, 
to  Br( 0))  and  the  solution  is  bounded  by  (61)  where  (3  —  Gr]  >  0,  we  can  conclude  that  the  origin 
is  asymptotically  stable.  □ 


5.1  Compensation  Example 

This  example  is  a  simple  demonstration  of  the  SDRE  technique  for  the  control  of  a  system  using 
state  estimator  based  compensation.  We  demonstrate  the  effectiveness  of  the  compensated  system 
by  adding  a  control  term  to  Example  4.1.  Hence,  the  system  is 

X\  =  X2X\  +X2  +  U, 

X2  =  —x\  —  x\, 

V  =  afi- 


In  SDC  form,  the  above  system,  along  with  an  SDRE  compensator,  can  be  rewritten  as 


x 

xe 

y 


0  1  +  x\ 

_-(l  +  xf)  0 

0  1  +  x 

-(i  +  ^ei)  0 

[l  0]  x. 


2 

ei 


u(xe ) 

Q  u(xe)  +  L(xe)(y  -  xei) 
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The  state- dependent  controllability  matrix  for  this  system  is 


M(x) 


1  0 
0  -(1  +  xf) 


and  the  state-dependent  observability  matrix  is 


O(x) 


1  0 
0  1  +  xf 


Since  both  M (x)  and  0(x )  have  full  rank  throughout  3ft2,  the  system  is  controllable  and  observable. 

For  the  weighting  matrices  in  (1)  and  (38)  we  use  Q  =  51,  R  =  1,  U  =  10 1,  and  V  =  0.01. 
The  interpolation  method  provides  a  nice  approach  to  approximate  both  II(x)  and  T(x).  The 
state  estimator  mesh  and  controller  mesh  are  the  same,  Mx  =  {—2  :  0.1  :  2}.  We  use  an  initial 
condition  of  xq  =  (1,1)T  for  the  state  and  xeo  =  (0.5,0.5)r  for  the  estimated  state  and  turn  the 
controller  on  at  t  =  2.0  seconds.  Figures  12(a)  and  12(b)  depict  the  uncontrolled,  controlled,  and 
estimated  states  for  this  example.  We  find  that  the  estimator  converges  to  the  state  quickly  and  the 
controller  stabilizes  the  system.  Figure  12(c)  is  the  state-space  representation,  while  Figure  12(d) 
is  the  error  over  the  time  interval  and  Figure  12(e)  exhibits  the  control  effort  using  state  estimator 
based  compensation. 
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(a) 


(b) 


0.8 

0.7 

0.6 

0.5 


Time  (sec) 


(c)  (d) 


(e) 


Figure  12:  Plots  for  Example  5.1,  (a)  state  x\,  (b)  state  X2 ,  and  (c)  state-space  representations  for  uncontrolled,  controlled, 
and  estimated  ( xe(t ))  systems,  (d)  norm  of  the  error  ||e(t)||2,  and  (e)  compensated  control  for  the  system  over  the  time  span 
0  <  t  <  5  with  state  initial  condition  xq  =  (1, 1)T  and  state  estimator  initial  condition  xeQ  =  (0.5,  0.5)T. 
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6  Conclusion 

In  this  paper  we  have  considered  SDRE  techniques  for  the  general  design  and  synthesis  of  feedback 
controllers,  estimators,  and  compensators  of  nonlinear  systems.  In  particular,  the  SDRE  methods 
were  formulated  for  cost  functionals  with  constant  weighting  coefficients.  The  resulting  closed  loop 
system  was  shown  to  be  locally  asymptotically  stable  and  a  local  convergent  result  was  obtained 
for  the  nonlinear  state  estimator.  In  addition,  two  approaches  were  presented  for  the  numerical 
approximation  of  the  solution  to  the  SDRE  for  a  large  class  of  nonlinear  problems.  The  numerical 
approach  is  based  on  interpolation  of  SDRE  solutions  or  controls  over  the  state  space.  This  approach 
is  very  easy  to  implement  and  was  shown  to  perform  very  well  on  a  wide  class  of  nonlinear  systems. 
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